Genome-wide scan of pulmonary phenotypes on local ancestry ⟶ genes interacting with smoking

Andrey Ziyatdinov, Postdoctoral Fellow at HSPH

September 27, 2017
Statistical Genetics Meeting
Channing Division of Network Medicine

Outline

  • Background and project goals
  • Interaction model on local ancestry
  • Association results
  • Post-association analysis

Gene-by-smoking interaction scans

Recent scans on SNP-by-smoking on pulmonary phenotypes

Ref.                   Study      N    Outcome        Test                         p < 5e-08
(Hancock et al. 2012)  Meta (19)  50K  FEV1           joint (Aschard et al. 2011)  nearby 3 genes
(Wain et al. 2015)     UKBiobank  50K  FEV1/FEV1-FVC  interaction


Recent proposals to improve power by aggregating variants

  1. Grouping into Genetic Risk Scores (GRS) (Aschard et al. 2017)
  2. Using the ancestry information (Aschard et al. 2015), (Park et al. 2016)

COPD-related outcomes suit the second approach, because the proportion of African ancestry has been shown to be associated with the risk of COPD (Kumar et al. 2010).

Admixed genome

Leveraging ancestry information

COPDGene African Americans (AA) dataset

  • 3.3K African Americans from the COPDGene project
  • 7 quantitative & 1 binary outcomes
    • FEV1/FEV1pp, FVC/FVCpp, FEV1_FVC, pctEmph_Slicer, TLCpp, finalGold
  • binary exposures
    • SmokCigNow, CigPerDayNow > 15, CompletedSchool > 2

37K long (>10kb) local ancestry segments (Parker et al. 2014)

Project goals

  1. Prove: local ancestry → gene-by-environment interactions
  2. Follow up on the ancestry-based findings
    • SNPs: fine-mapping
    • Enrichment analysis
    • Check Gene Expression and Methylation in gene candidate regions

Outline

  • Background and project goals
  • Interaction model on local ancestry
  • Association results
  • Post-association analysis

First association model

  • marginal effect: \(y \sim a_g + a_l\)
  • interaction effect: \(y \sim a_g + a_l + x_e + a_g * x_e + a_l * x_e\)
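
Written out per individual, the two models above can be read as follows. This is a sketch under stated assumptions: the coefficient names (\(\beta_g\), \(\beta_l\), \(\beta_e\), \(\beta_{ge}\), \(\beta_{le}\)), the intercept and the residual \(\varepsilon_i\) are notation introduced here for illustration, and the ancestry-by-exposure term is taken to involve the local ancestry \(a_l\). The tested null hypothesis is shown next to each model.

```latex
\begin{align}
\text{marginal:} \quad & y_i = \beta_0 + \beta_g a_{g,i} + \beta_l a_{l,i} + \varepsilon_i,
  & H_0 &: \beta_l = 0 \\
\text{interaction:} \quad & y_i = \beta_0 + \beta_g a_{g,i} + \beta_l a_{l,i} + \beta_e x_{e,i}
  + \beta_{ge}\, a_{g,i} x_{e,i} + \beta_{le}\, a_{l,i} x_{e,i} + \varepsilon_i,
  & H_0 &: \beta_{le} = 0
\end{align}
```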

Confounding factors other than global ancestry \(a_g\):

  • trait-specific covariates, e.g.
    • FEV1 ~ Age + Age^2 + Gender + Height + PackYears + SmokCigNow
  • random effect on medical centers (\(\approx\) 5% of variance)
    • random effect of medical device for pctEmph_Slicer trait

This (marginal) model was used in (Parker et al. 2014).
Is it OK for interaction?

First QQ plots: marginal & interaction

Our approach to fix model misspecification

  1. Ancestry Relatedness Matrix (ARM) (Zaitlen et al. 2014)
  2. Another (EARM) for ancestry-by-exposure component (Sul et al. 2016)
  3. Modeling heteroskedasticity (Don’t depreciate exploratory plots!)
  4. Selection of smoking covariates
    • SmokCigNow + ATS_PackYearsDuration_Smoking + log_CigPerDaySmokAvg + SmokCigNow + SmokCigNow0_15 + SmokCigarNow
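
One way to read items 1 to 3 together is as a linear mixed model with two ancestry-based random effects and an exposure-specific residual variance. The block below is only a sketch of that structure, not the exact model fitted for these slides: the symbols \(K_{\mathrm{ARM}}\), \(K_{\mathrm{EARM}}\), the variance parameters, and the element-wise product form of the EARM (borrowed from the general gene-by-environment kernel idea in Sul et al. 2016) are assumptions introduced here. The random effect of medical center is omitted for brevity.

```latex
\begin{align}
y &= X\beta + u + v + \varepsilon \\
u &\sim N(0, \sigma_a^2 K_{\mathrm{ARM}}), \qquad
v \sim N(0, \sigma_{ae}^2 K_{\mathrm{EARM}}), \qquad
K_{\mathrm{EARM}} = K_{\mathrm{ARM}} \circ (x_e x_e^T) \\
\varepsilon_i &\sim N(0, \sigma_{x_{e,i}}^2), \quad x_{e,i} \in \{0, 1\}
\end{align}
```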

More details in our previous talk, “COPDGene African-Americans & QQ plots”.

Heteroskedasticity & Covariance Matrices

Results: Clean QQ plots (marginal)

Results: Nearly Clean QQ plots (interaction)

Outline

  • Background and project goals
  • Interaction model on local ancestry
  • Association results
  • Post-association analysis

Ancestry-by-SmokCigNow (7 traits)

Zoom in on Chr 11 (2 repetitive traits)

Multi-trait joint test (5 traits)

\(z = [z_1; z_2; \dots]^T \sim N(0, \Sigma)\)
under the null hypothesis

  • Estimate covariance pairs
    • truncated normal
    • threshold 2.5
  • Apply the Omnibus test
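
The two steps above can be sketched in plain Python. This is a minimal illustration under assumptions, not the code used for the talk: the function names are hypothetical, `truncated_corr` uses a plain sample correlation on pairs with both |z| below the threshold as a crude stand-in for a proper truncated-normal fit, and the omnibus statistic is taken to be \(z^T \Sigma^{-1} z\), compared against a chi-square distribution with one degree of freedom per trait.

```python
import math

def chi2_sf(x, k):
    """Upper tail of a chi-square with k degrees of freedom.

    Uses sf_{k+2}(x) = sf_k(x) + (x/2)^(k/2) * exp(-x/2) / Gamma(k/2 + 1),
    starting from k=1 (erfc) or k=2 (exponential).
    """
    if k % 2 == 1:
        sf, j = math.erfc(math.sqrt(x / 2.0)), 1
    else:
        sf, j = math.exp(-x / 2.0), 2
    while j < k:
        sf += (x / 2.0) ** (j / 2.0) * math.exp(-x / 2.0) / math.gamma(j / 2.0 + 1.0)
        j += 2
    return sf

def solve(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(m[r][c]))
        m[c], m[p] = m[p], m[c]
        for r in range(c + 1, n):
            f = m[r][c] / m[c][c]
            for j in range(c, n + 1):
                m[r][j] -= f * m[c][j]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x

def truncated_corr(z1, z2, thr=2.5):
    """Sample correlation of z-score pairs with both |z| < thr (assumed null-like)."""
    pairs = [(a, b) for a, b in zip(z1, z2) if abs(a) < thr and abs(b) < thr]
    n = len(pairs)
    ma = sum(a for a, _ in pairs) / n
    mb = sum(b for _, b in pairs) / n
    sab = sum((a - ma) * (b - mb) for a, b in pairs)
    saa = sum((a - ma) ** 2 for a, _ in pairs)
    sbb = sum((b - mb) ** 2 for _, b in pairs)
    return sab / math.sqrt(saa * sbb)

def omnibus_p(z, sigma):
    """P-value of the statistic z^T sigma^{-1} z under z ~ N(0, sigma)."""
    w = solve(sigma, z)
    stat = sum(zi * wi for zi, wi in zip(z, w))
    return chi2_sf(stat, len(z))

# Toy example: 5 traits with a made-up equicorrelated covariance (rho = 0.5).
rho = 0.5
sigma = [[1.0 if i == j else rho for j in range(5)] for i in range(5)]
print(omnibus_p([4.5, 3.9, 3.8, 3.0, 2.5], sigma))
```

In practice the entries of \(\Sigma\) would come from the pairwise estimates over all segments rather than from a fixed `rho`.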

Results: Omnibus test (5 traits)

Results: Top Genes

Bonferroni 0.05 / 37K = 1.4e-06


Ancestry segment: 11:12,332,105 - 12,394,102
Genes: PARVA, MICAL2, MICALCL, RASSF10, TEAD1

Trait     Exposure    z-score  p-value
FEV1pp    SmokCigNow  4.5      7.3e-06
Omnibus   SmokCigNow           5.7e-05
FEV1_FVC  SmokCigNow  3.9      1.1e-04
FEV1      SmokCigNow  3.8      1.4e-04

Ancestry segment: 2:238,819,792 - 238,904,351
Genes: TWIST2, HDAC4, MIR4440, MIR4441

Trait    Exposure        z-score  p-value
FEV1     SmokCigNow0_15  4.2      2.6e-05
FEV1pp   SmokCigNow0_15  4.1      4.9e-05
FVC      SmokCigNow0_15  4.0      6.9e-05
Omnibus  SmokCigNow0_15           1.5e-04
FVCpp    SmokCigNow0_15  3.7      2.5e-04
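
The Bonferroni threshold and the z-score to p-value conversion behind these tables can be cross-checked with a few lines of Python. This is a sketch assuming two-sided p-values from a standard normal reference; the function names are hypothetical, and because the z-scores on the slide are rounded to one decimal, the recomputed p-values will only approximately match the reported ones.

```python
import math

def bonferroni_threshold(alpha, n_tests):
    """Per-test significance level for n_tests independent tests."""
    return alpha / n_tests

def two_sided_p(z):
    """Two-sided p-value of a standard normal z-score."""
    return math.erfc(abs(z) / math.sqrt(2.0))

# 37K local ancestry segments, as stated on the slide.
threshold = bonferroni_threshold(0.05, 37_000)
print(f"Bonferroni threshold: {threshold:.2e}")

# Rounded z-scores for the Chr 11 segment (exposure SmokCigNow).
for trait, z in [("FEV1pp", 4.5), ("FEV1_FVC", 3.9), ("FEV1", 3.8)]:
    p = two_sided_p(z)
    print(f"{trait}: z={z}, p={p:.1e}, below threshold: {p < threshold}")
```

With the numbers reported above, the smallest p-value (7.3e-06) is still larger than the Bonferroni threshold of 1.4e-06, which is one motivation for estimating the effective number of tests.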

Outline

  • Background and project goals
  • Interaction model on local ancestry
  • Association results
  • Post-association analysis (ongoing work)

The effective number of tests

Enrichment analysis

  1. Gene-set enrichment
  2. Functional or tissue-specific enrichment
    • SNP-level resolution


Data Min size Mean size
Local ancestry 10,000 bp 13,000 bp
ENCODE tissue-specific 150 bp
Intersection 10,000 bp 11,000 bp
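
The "Intersection" row can be read as the subset of local ancestry segments that overlap at least one annotated element. The sketch below illustrates that reading only; it is an assumption about how the intersection was defined, the function name is hypothetical, and the element coordinates are made up for the example (only the Chr 11 segment coordinates come from the slides).

```python
def overlapping_segments(segments, elements):
    """Return the segments that overlap at least one element.

    Both arguments are lists of (start, end) positions in bp on the same
    chromosome, with inclusive ends.
    """
    kept = []
    for s_start, s_end in segments:
        if any(e_start <= s_end and s_start <= e_end for e_start, e_end in elements):
            kept.append((s_start, s_end))
    return kept

# One real segment from the results (Chr 11) and one made-up segment.
segments = [(12_332_105, 12_394_102), (20_000_000, 20_015_000)]
# Hypothetical 150 bp annotation elements.
elements = [(12_350_000, 12_350_149), (30_000_000, 30_000_149)]

for start, end in overlapping_segments(segments, elements):
    print(f"segment {start}-{end} ({end - start + 1} bp) overlaps an element")
```

A simple scan over all pairs is enough at this scale for a single chromosome; sorted intervals or an interval tree would be the usual choice for genome-wide annotation sets.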

Leveraging Methylation data

We observed that the top associated genes have previously been reported to be associated with epigenetic changes.

Hypothesis: the mechanism of ancestry-based association is:
Smoking → Up/Down Methylation → COPD-related phenotype

Wan et al., Smoking-associated site-specific differential methylation in buccal mucosa in the COPDGene study (2015)

References

Aschard et al. 2011. “Genome-wide meta-analysis of joint tests for genetic and gene-environment interaction effects.” Human Heredity 70 (4): 292–300. doi:10.1159/000323318.

———. 2015. “Leveraging local ancestry to detect gene-gene interactions in genome-wide data.” BMC Genetics 16 (1): 124. doi:10.1186/s12863-015-0283-z.

———. 2017. “Evidence for large-scale gene-by-smoking interaction effects on pulmonary function.” International Journal of Epidemiology 46 (3): 894–904. doi:10.1093/ije/dyw318.

Hancock et al. 2012. “Genome-Wide Joint Meta-Analysis of SNP and SNP-by-Smoking Interaction Identifies Novel Loci for Pulmonary Function.” PLoS Genetics 8 (12). doi:10.1371/journal.pgen.1003098.

Kumar et al. 2010. “Genetic Ancestry in Lung-Function Predictions.” New England Journal of Medicine 363 (4): 321–30. doi:10.1056/NEJMoa0907897.

Park et al. 2016. “An Ancestry Based Approach for Detecting Interactions.”

Parker et al. 2014. “Admixture mapping identifies a quantitative trait locus associated with FEV1/FVC in the COPDGene Study.” Genetic Epidemiology 38 (7): 652–59. doi:10.1002/gepi.21847.

Renier et al. 2017. “HHS Public Access” 165 (7): 1789–1802. doi:10.1016/j.cell.2016.05.007.

Sul et al. 2016. “Accounting for Population Structure in Gene-by-Environment Interactions in Genome-Wide Association Studies Using Mixed Models.” PLoS Genetics 12 (3): e1005849. doi:10.1371/journal.pgen.1005849.

Wain et al. 2015. “Novel insights into the genetics of smoking behaviour, lung function, and chronic obstructive pulmonary disease (UK BiLEVE): A genetic association study in UK Biobank.” The Lancet Respiratory Medicine 3 (10): 769–81. doi:10.1016/S2213-2600(15)00283-0.

Zaitlen et al. 2014. “Leveraging population admixture to characterize the heritability of complex traits.” Nature Genetics 46 (12): 1356–62. doi:10.1038/ng.3139.